Summary


This report provides an in-depth analysis of the Materials Project batteries dataset. It includes an overview of the dataset, covering its size, column descriptions, and any missing values. Additionally, the report features various visualizations, predictive modeling, and an analysis of current trends in the battery industry.

Key Findings

  • Dominance of Lithium Batteries: Lithium batteries are by far the largest group in the dataset, more than five times as numerous as the second-largest group. Further analysis attributes their widespread use to robust stability, high energy density, and limited volume change during cycling.

  • Aluminum as a Promising Alternative: While lithium batteries dominate the dataset, aluminum batteries exhibit similar qualities and in some cases surpass lithium’s performance, particularly in volumetric capacity. However, aluminum batteries tend to show larger volume changes and lower stability. Their potential advantages make them a promising alternative that deserves further attention despite these trade-offs.


Libraries


library(dplyr)
library(readr)
library(ggplot2)
library(kableExtra)
library(gridExtra)
library(purrr)
library(corrplot)
library(tidyr)
library(plotly)
library(RColorBrewer)
library(DT)
library(caret)
library(randomForest)
library(factoextra)
library(fpc)
library(dbscan)

Essentials


Basic Dataset Statistics

  • Number of Rows: 4351
  • Number of Columns: 17
  • Total Missing Values: 0
No NULL or NA values are present in any of the attributes.
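
The figures above can be reproduced with a few base R calls. The sketch below is illustrative only: `toy` is a made-up three-row stand-in for the real data frame (one value is deliberately set to NA so the missing-value counts are non-trivial), and the Battery ID values are placeholders rather than real Materials Project identifiers.

```r
# Toy stand-in for the batteries data frame (assumption: the real object is a
# data frame with 4351 rows and 17 columns; these rows are invented).
toy <- data.frame(
  `Battery ID` = c("a_Li", "b_Na", "c_Mg"),
  `Working Ion` = c("Li", "Na", "Mg"),
  `Average Voltage` = c(3.3, 2.1, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

n_rows <- nrow(toy)                       # number of rows
n_cols <- ncol(toy)                       # number of columns
total_missing <- sum(is.na(toy))          # NA count over the whole table
missing_by_column <- colSums(is.na(toy))  # NA count per attribute
```

On the real dataset, `total_missing` would be expected to equal 0, matching the statement above.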

Overview of Categorical Attributes


Attribute Descriptions

  • Battery ID: Identifier of the battery.
  • Battery Formula: Chemical formula of the battery material.
  • Working Ion: Primary ion responsible for charge transport in the battery.
  • Formula Charge: Chemical formula of the battery material in the charged state.
  • Formula Discharge: Chemical formula of the battery material in the discharged state.

Summary Table

Attribute          Length  Class      Mode
Battery ID         4351    character  character
Battery Formula    4351    character  character
Working Ion        4351    character  character
Formula Charge     4351    character  character
Formula Discharge  4351    character  character


Summary of Numeric Attributes


Attribute Descriptions

  • Max Delta Volume: Relative change in volume for a given voltage step, computed as max(charge, discharge) / min(charge, discharge) - 1. The formula yields a fraction, so 0.04 corresponds to a 4% change.
  • Average Voltage: Average voltage for each voltage step.
  • Gravimetric Capacity: Charge stored per unit mass (mAh/g).
  • Volumetric Capacity: Charge stored per unit volume (mAh/cm³).
  • Gravimetric Energy: Energy density relative to the battery mass (Wh/kg).
  • Volumetric Energy: Energy density relative to the battery volume (Wh/L).
  • Atomic Fraction Charge: Atomic fraction of components in the charged state.
  • Atomic Fraction Discharge: Atomic fraction of components in the discharged state.
  • Stability Charge: Stability indicator of the material in the charged state.
  • Stability Discharge: Stability indicator of the material in the discharged state.
  • Steps: Number of distinct voltage steps from fully charged to discharged, based on stable intermediate states.
  • Max Voltage Step: Maximum absolute difference between adjacent voltage steps.
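
As a small worked example of the Max Delta Volume formula, the sketch below evaluates max(charge, discharge) / min(charge, discharge) - 1 for a pair of volumes. The function name `max_delta_volume` and the input volumes are hypothetical and chosen only for illustration; they do not come from the dataset.

```r
# Relative volume change between charged and discharged states.
# pmax()/pmin() make the function work element-wise on vectors.
max_delta_volume <- function(volume_charge, volume_discharge) {
  pmax(volume_charge, volume_discharge) / pmin(volume_charge, volume_discharge) - 1
}

# Hypothetical volumes (arbitrary units): a cell that swells from 100 to 104
delta <- max_delta_volume(100, 104)

# The result does not depend on which state is larger
delta_swapped <- max_delta_volume(104, 100)
```

Here `delta` is a fraction of about 0.04, that is, a 4% change.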

Summary Table

Attribute                  Min.       1st Qu.    Median     Mean       3rd Qu.    Max.
Max Delta Volume           0.00002    0.01747    0.04203    0.37531    0.08595    293.19322
Average Voltage            -7.755     2.226      3.301      3.083      4.019      54.569
Gravimetric Capacity       5.176      88.108     130.691    158.291    187.600    2557.627
Volumetric Capacity        24.08      311.62     507.03     610.62     722.75     7619.19
Gravimetric Energy         -583.5     211.7      401.8      444.1      614.4      5926.9
Volumetric Energy          -2208.1    821.6      1463.8     1664.0     2252.3     18305.9
Atomic Fraction Charge     0.00000    0.00000    0.00000    0.03986    0.04762    0.90909
Atomic Fraction Discharge  0.007407   0.086957   0.142857   0.159077   0.200000   0.993333
Stability Charge           0.00000    0.03301    0.07319    0.14257    0.13160    6.48710
Stability Discharge        0.00000    0.01952    0.04878    0.12207    0.09299    6.27781
Steps                      1.000      1.000      1.000      1.167      1.000      6.000
Max Voltage Step           0.0000     0.0000     0.0000     0.1503     0.0000     26.9607


Values Distribution

Visual Analysis


Total Number of Batteries per Ion


Outlier Removal

The summary statistics show extreme values in several attributes, which would compress the plots and skew comparisons between Working Ion groups. For the following graphs, outliers were therefore filtered out with the function below, which keeps only values within 1.5 × IQR of the first and third quartiles. The remaining data better reflect typical trends and allow a fairer comparison across groups.

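remove_outliers <- function(x) {
  # Keep values within 1.5 * IQR of the first and third quartiles
  q1 <- quantile(x, 0.25, na.rm = TRUE)
  q3 <- quantile(x, 0.75, na.rm = TRUE)
  iqr <- q3 - q1  # lower-case name avoids masking stats::IQR()
  x[!is.na(x) & x >= (q1 - 1.5 * iqr) & x <= (q3 + 1.5 * iqr)]
}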
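
Because the function above returns a shortened vector, it cannot be used directly to drop whole rows of a data frame. One possible way to apply the same 1.5 × IQR rule per Working Ion group is sketched below. This is an assumption about usage, not the report's actual plotting code: `is_inlier` is a hypothetical logical companion to `remove_outliers`, and `toy` contains invented values.

```r
library(dplyr)

# Invented data: one obviously extreme voltage in each ion group
toy <- data.frame(
  `Working Ion` = c(rep("Li", 6), rep("Na", 6)),
  `Average Voltage` = c(3.1, 3.3, 3.0, 3.4, 3.2, 54.5,
                        2.1, 2.4, 2.2, 2.0, 2.3, -7.7),
  check.names = FALSE
)

# TRUE where a value lies within 1.5 * IQR of the quartiles (hypothetical helper)
is_inlier <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  iqr <- q[[2]] - q[[1]]
  !is.na(x) & x >= q[[1]] - 1.5 * iqr & x <= q[[2]] + 1.5 * iqr
}

# Apply the rule within each group so each ion is judged against its own spread
filtered <- toy %>%
  group_by(`Working Ion`) %>%
  filter(is_inlier(`Average Voltage`)) %>%
  ungroup()
```

Filtering within groups rather than over the pooled column keeps a group with naturally higher or lower values from being discarded wholesale; whether the original graphs were filtered per group or globally is not stated in the report.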


Ion Average Voltage Characteristics


Ion Stability


Volume Fluctuations


Volumetric vs Gravimetric Energy


Volumetric vs Gravimetric Capacity


Atomic Fraction


Correlation Matrix


Notable Correlations for Analysis

  • Average Voltage and Gravimetric / Volumetric Energy (0.67 / 0.55): Both energy measures correlate positively with average voltage. This is expected, because energy density is the product of average voltage and capacity. The correlation is somewhat weaker for volumetric energy than for gravimetric energy.
  • Atomic Fraction Charge / Discharge (0.60): Atomic fractions in the charged and discharged states generally move together. Large imbalances between the two could affect stability and longevity.
  • Stability Charge / Discharge (0.80): Materials that are stable in the charged state tend to be stable in the discharged state as well, which is favorable for cycling reliability and long-term performance.
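
A list of notable pairs like the one above can be extracted from a correlation matrix programmatically. The following is a minimal sketch on synthetic data, assuming a cutoff of |r| >= 0.5; the cutoff, the column names, and the toy relationship energy = voltage × capacity are illustrative assumptions, not values taken from the report.

```r
set.seed(1)
n <- 200
voltage <- rnorm(n, mean = 3, sd = 1)
capacity <- rnorm(n, mean = 150, sd = 40)

# Synthetic data: energy is built as voltage * capacity, noise is unrelated
toy <- data.frame(
  voltage = voltage,
  capacity = capacity,
  energy = voltage * capacity,
  noise = rnorm(n)
)

cm <- cor(toy)  # Pearson correlation matrix

# Upper triangle only, so each pair is listed once and the diagonal is skipped
idx <- which(upper.tri(cm) & abs(cm) >= 0.5, arr.ind = TRUE)

notable <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r = cm[idx]
)
```

In this toy setup the voltage and energy pair appears in `notable`, while pairs involving `noise` should not.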


Model Predictions


The estimates below were generated by varying four model inputs (Max Delta Volume, Average Voltage, Gravimetric Capacity, and Stability Charge) in a controlled progression. The progression simulates a range of hypothetical states to visualize the projected changes in Gravimetric Energy. Each data point on the plot has a tooltip showing the estimated energy together with the corresponding value of each parameter.

Note: Model training was conducted without outliers to improve prediction accuracy and reliability.
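
The report does not show the training code or name the model type. Since the randomForest package is loaded, one plausible setup is a random forest regression on the four predictors. The sketch below makes that assumption and uses synthetic data with syntactic column names; the 80/20 split, `ntree = 200`, and all values are illustrative choices, not the report's settings.

```r
library(randomForest)

set.seed(42)
n <- 300

# Synthetic predictors standing in for the four model inputs
toy <- data.frame(
  max_delta_volume = runif(n, 0, 0.2),
  average_voltage = runif(n, 1, 5),
  gravimetric_capacity = runif(n, 50, 300),
  stability_charge = runif(n, 0, 0.3)
)
# Synthetic target: voltage * capacity plus noise
toy$gravimetric_energy <- toy$average_voltage * toy$gravimetric_capacity +
  rnorm(n, sd = 10)

predictors <- c("max_delta_volume", "average_voltage",
                "gravimetric_capacity", "stability_charge")
train_idx <- sample(n, size = 240)  # 80% of rows for training

rf_model <- randomForest(
  x = toy[train_idx, predictors],
  y = toy$gravimetric_energy[train_idx],
  ntree = 200
)

# Evaluate on the held-out 20%
test_pred <- predict(rf_model, newdata = toy[-train_idx, predictors])
rmse <- sqrt(mean((test_pred - toy$gravimetric_energy[-train_idx])^2))
```

Holding out part of the data gives an error estimate on rows the model has not seen, which is a stricter check than comparing predictions against training rows.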


Model Estimations Graph


Parameter Impact Visualizations

The next four graphs illustrate the impact of each parameter on predicted energy by sweeping the selected variable across a range of values while holding the other variables at their mean values.
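
The sweep described above can be sketched as a small helper that builds a prediction grid. Everything here is an assumption for illustration: `sweep_variable` is a hypothetical function name, the data are synthetic, and a linear model stands in for the trained model, since any fitted object with a `predict()` method would work the same way.

```r
set.seed(7)
n <- 200
toy <- data.frame(
  average_voltage = runif(n, 1, 5),
  gravimetric_capacity = runif(n, 50, 300),
  stability_charge = runif(n, 0, 0.3)
)
toy$gravimetric_energy <- 100 * toy$average_voltage +
  2 * toy$gravimetric_capacity + rnorm(n, sd = 5)

predictors <- c("average_voltage", "gravimetric_capacity", "stability_charge")
fit <- lm(gravimetric_energy ~ ., data = toy)  # stand-in for the trained model

sweep_variable <- function(model, data, variable, predictors, n_points = 25) {
  # One row with every predictor at its mean, repeated n_points times
  means <- lapply(data[predictors], mean)
  grid <- as.data.frame(means)[rep(1, n_points), , drop = FALSE]
  # Replace the chosen variable with an evenly spaced sequence over its range
  grid[[variable]] <- seq(min(data[[variable]]), max(data[[variable]]),
                          length.out = n_points)
  grid$predicted <- predict(model, newdata = grid)
  grid
}

voltage_curve <- sweep_variable(fit, toy, "average_voltage", predictors)
```

Plotting `predicted` against `average_voltage` in `voltage_curve` gives a curve of the kind shown in the Voltage Impact graph. Note that a one-at-a-time sweep cannot show interactions between variables.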


Voltage Impact


Volume Delta Impact


Gravimetric Capacity Impact


Stability Charge Impact

Among the parameters studied, the trained model indicates that Average Voltage and Gravimetric Capacity have the greatest influence on predicted energy, while Max Delta Volume and Stability Charge show only slight and inconclusive variations in energy output.


Model vs Real Data

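set.seed(5643)  # fixed seed so the same 10 rows are drawn on every run

# Draw 10 random batteries from the dataset
sample_data <- batteries_data[sample(nrow(batteries_data), 10), ]

# Keep only the four predictors used by the model
sample_input <- sample_data[, c("Max Delta Volume", "Average Voltage", "Gravimetric Capacity", "Stability Charge")]

# Predict Gravimetric Energy for the sampled rows
sample_data$Predicted_Energy <- predict(model, newdata = sample_input)

# Show actual and predicted values side by side
result_data <- sample_data[, c("Battery ID", "Gravimetric Energy", "Predicted_Energy")]

knitr::kable(result_data)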

Battery ID      Gravimetric Energy  Predicted_Energy
mp-756701_Li    112.98370           118.57894
mp-759500_Li    1013.87490          1006.20573
mp-759832_Na    557.57223           557.63370
mp-19395_Li     251.70507           246.12879
mp-510366_K     217.53634           215.47054
mp-25974_Li     273.43322           276.46133
mp-1044783_Zn   131.82913           129.93249
mp-1041066_Mg   841.78509           847.02604
mp-757896_Li    97.81006            83.97572
mp-771696_Li    311.71147           314.62681

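The model generally predicts energy well: for nine of the ten sampled batteries the prediction is within about 5% of the actual value. It is less reliable for atypical entries, however. The largest miss is mp-757896_Li, where the predicted energy (83.98) falls about 14% below the actual value (97.81).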
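
The comparison can be quantified with simple error metrics. The sketch below re-enters the ten rows from the table above by hand and computes absolute and percentage errors; the column names `actual` and `predicted` and the choice of metrics are illustrative, not part of the original analysis.

```r
# Values copied from the sample table above
results <- data.frame(
  battery_id = c("mp-756701_Li", "mp-759500_Li", "mp-759832_Na", "mp-19395_Li",
                 "mp-510366_K", "mp-25974_Li", "mp-1044783_Zn", "mp-1041066_Mg",
                 "mp-757896_Li", "mp-771696_Li"),
  actual = c(112.98370, 1013.87490, 557.57223, 251.70507, 217.53634,
             273.43322, 131.82913, 841.78509, 97.81006, 311.71147),
  predicted = c(118.57894, 1006.20573, 557.63370, 246.12879, 215.47054,
                276.46133, 129.93249, 847.02604, 83.97572, 314.62681)
)

results$abs_error <- abs(results$predicted - results$actual)
results$pct_error <- 100 * results$abs_error / results$actual

mae <- mean(results$abs_error)                           # mean absolute error
worst <- results$battery_id[which.max(results$pct_error)]  # largest relative miss
```

Ten rows are too few to characterize the model overall, so these numbers only summarize this particular sample.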